Confidence-Based Feature Acquisition to Minimize Training and Test Costs
Authors
Abstract
We present Confidence-based Feature Acquisition (CFA), a novel supervised learning method for acquiring missing feature values when data are missing at both training and test time. Previous work has considered missing data at training time (e.g., Active Feature Acquisition, AFA [8]) or at test time (e.g., Cost-Sensitive Naive Bayes, CSNB [2]), but not both. At training time, CFA constructs a cascaded ensemble of classifiers, starting with the zero-cost features and adding a single feature for each successive model. For each model, CFA selects a subset of training instances for which the added feature should be acquired. At test time, the models are applied sequentially (as a cascade), stopping when a user-supplied confidence threshold is met. We compare CFA to AFA, CSNB, and several other baselines, and find that CFA's accuracy is at least as high as that of the other methods, while it incurs significantly lower feature acquisition costs.
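The test-time cascade described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `run_cascade`, `stages`, and `acquire`, the toy models, and the feature costs are all hypothetical, and returning the last model's prediction when no model reaches the threshold is an assumption the abstract does not confirm. The training-time selection of instances is not shown.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A model maps the features known so far to (label, confidence).
Model = Callable[[Dict[str, float]], Tuple[str, float]]
# A stage is (feature added at this stage or None, cost of that feature, model).
Stage = Tuple[Optional[str], float, Model]


def run_cascade(
    stages: List[Stage],
    free_features: Dict[str, float],
    acquire: Callable[[str], float],
    threshold: float,
) -> Tuple[str, float, float]:
    """Apply the models in order, buying one feature per stage, and stop
    as soon as a model's confidence reaches the threshold.

    Assumption: if no model is confident enough, the last model's
    prediction is returned.
    """
    known = dict(free_features)
    total_cost = 0.0
    label, confidence = "", 0.0
    for feature, cost, model in stages:
        if feature is not None:
            known[feature] = acquire(feature)  # pay for the missing value
            total_cost += cost
        label, confidence = model(known)
        if confidence >= threshold:
            break
    return label, confidence, total_cost


# Toy stand-ins for trained classifiers, with fixed confidences.
def stage0(f: Dict[str, float]) -> Tuple[str, float]:
    return ("pos" if f["age"] > 50 else "neg", 0.6)


def stage1(f: Dict[str, float]) -> Tuple[str, float]:
    return ("pos" if f["test_a"] > 1.0 else "neg", 0.95)


def stage2(f: Dict[str, float]) -> Tuple[str, float]:
    return ("pos" if f["test_b"] > 0.0 else "neg", 0.99)


# Hypothetical cascade: a zero-cost stage, then two paid features.
stages: List[Stage] = [
    (None, 0.0, stage0),
    ("test_a", 5.0, stage1),
    ("test_b", 20.0, stage2),
]

# Values that would be revealed only if the feature is paid for.
hidden = {"test_a": 2.0, "test_b": -1.0}

result = run_cascade(stages, {"age": 60.0}, hidden.__getitem__, threshold=0.9)
print(result)
```

In this toy run the zero-cost model is not confident enough at a threshold of 0.9, so the cascade buys `test_a` and stops there, never paying for the more expensive `test_b`. Lowering the threshold trades confidence for lower acquisition cost.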
Similar resources
Pattern Recognition in Control Chart Using Neural Network based on a New Statistical Feature
Today, to speed up the identification and timely correction of process deviations, advanced techniques are needed to minimize the cost of producing defective products. To this end, control charts, one of the important tools of statistical process control, have been used in combination with modern tools such as artificial neural networks. The artificial neural netw...
Active Feature Acquisition with Supervised Matrix Completion
Missing features are a serious problem in many applications; they can lower the quality of the training data and significantly degrade learning performance. Because feature acquisition usually involves special devices or complex processes, acquiring all feature values for the whole dataset is expensive. On the other hand, features may be correlated with each other, and some values may ...
Feature-Budgeted Random Forest
We seek decision rules for prediction-time cost reduction, where complete data is available for training, but during prediction-time, each feature can only be acquired for an additional cost. We propose a novel random forest algorithm to minimize prediction error for a user-specified average feature acquisition budget. While random forests yield strong generalization performance, they do not ex...
The Acquisition of Definiteness Feature by Persian L2 Learners of English
The definiteness feature in English is both LF and PF interpretable while Persian is a language in which this feature is LF-interpretable but PF-uninterpretable. Hence, there is no overt article or morphological inflection in Persian denoting a definite context. Furthermore, Persian partially encodes specificity not definiteness. In definiteness both the speaker and hearer are involved while in...
Neuro-Fuzzy Based Algorithm for Online Dynamic Voltage Stability Status Prediction Using Wide-Area Phasor Measurements
In this paper, a novel neuro-fuzzy method combined with a feature selection technique is proposed for online dynamic voltage stability status prediction in power systems. The technique uses synchronized phasors measured by phasor measurement units (PMUs) in a wide-area measurement system. In order to minimize the number of neuro-fuzzy inputs, training time and complexity of neuro-fuzzy ...